Abstract. In runtime verification, the central problem is to decide if 
a given program execution violates a given property. In online runtime 
verification, a monitor observes a program’s execution as it happens. If 
the program being observed has hard real-time constraints, then the 
monitor inherits them. In the presence of hard real-time constraints it 
becomes a challenge to maintain enough information to produce error 
traces, should a property violation be observed. In this paper we introduce 
a data structure, called tree buffer, that solves this problem in the context 
of automata-based monitors: If the monitor itself respects hard real-time 
constraints, then enriching it by tree buffers makes it possible to provide 
error traces, which are essential for diagnosing defects. We show that 
tree buffers are also useful in other application domains. For example, 
they can be used to implement the functionality of capturing groups in 
regular expressions. We prove optimal asymptotic bounds for our data 
structure, and validate them using empirical data from two sources: 
regular expression searching through Wikipedia, and runtime verification 
of execution traces obtained from the DaCapo test suite. 


1 Introduction 

In runtime verification, a program is instrumented to emit events at certain 
times, such as method calls and returns. A monitor runs in parallel, observes the 
stream of events, and identifies bad patterns. Often, the monitor is specified by 
an automaton (for example, see [1,2,8,12,22]). When the accepting state of the 
automaton is reached, the last event of the program corresponds to a bug. At this 
point, developers want to know how the bug was reached. For example, the bug 
could be that an invalid iterator is used to access its underlying collection. An 
iterator becomes invalid when its underlying collection is modified, for instance 
by calling the remove method of another iterator for the same collection. In 
order to diagnose the root cause of the bug, developers will want to determine 
how exactly the iterator became invalid. Of particular interest will be an error 
trace: the last few relevant events that led to a bug. In the context of static 
verification, error traces have proved to be invaluable in diagnosing the root cause 
of bugs [18]. However, runtime verification tools (such as [5,13,20]) shy away 
from providing error traces, perhaps because adding this functionality would 
impact efficiency. The goal of this paper is to provide the algorithmic foundations 
of efficient monitors that can provide error traces for a very general class of 
specifications. 






Fig. 1. Two automata with relevant transitions in boldface. 


Nondeterministic automata provide a convenient specification formalism for 
monitors. They define both bugs and relevant events. Figure 1a shows an example 
automaton that specifies incorrect usage of an iterator: it is a bug if an iterator 
is created (event iter), and afterwards its next() method is called without a 
preceding call to hasNext(). Throughout the paper we assume that the user 
specifies which transitions are relevant. In most applications, there is a natural way 
to choose the relevant transitions. For example, in Figure 1a and in many other 
runtime verification properties, the natural choice is the non-loop transitions. 
Since the choice is natural, it can be automated; since the choice is dependent 
on application details, we do not focus on it. 

We have to consider nondeterministic automata in general. Nondeterministic 
finite automata allow exponentially more succinct specifications than deterministic 
finite automata. In addition, in the runtime verification context we must use 
an automaton model that handles possibly infinite alphabets. For most models 
of automata over infinite alphabets, the nondeterministic variant is strictly 
more expressive than the deterministic variant [3,15,25]. Thus, we must consider 
nondeterminism not only to allow concise specifications, but also because some 
specifications cannot be defined otherwise. 

Let us consider a concrete example: the automaton in Figure 1b, consuming 
the stream of letters cabbcab. (We say stream when we wish to emphasize that the 
elements of the sequence must be processed one by one, in an online fashion.) One 
of the automaton computations labeled by cabbcab is 
$1 \xrightarrow{c} 1 \xrightarrow{\mathbf{a}} 1 \xrightarrow{b} 1 \xrightarrow{b} 1 \xrightarrow{c} 1 \xrightarrow{\mathbf{a}} 2 \xrightarrow{\mathbf{b}} 3$, 
where relevant transitions are bold. We say that the subsequence 
formed by the relevant transitions is an error trace; here, 
$1 \xrightarrow{a} 1 \xrightarrow{a} 2 \xrightarrow{b} 3$. 

The main contribution of this paper is the design of a data structure that 
allows the monitor to do the following while reading a stream: 

1. The monitor keeps track of the states that the nondeterministic automaton 
could currently be in. Whenever the automaton could be in an accepting 
state, the monitor reports (i) the occurrence of a bug, and (ii) the last h 
relevant transitions of a run that drove the automaton into an accepting state. 
Here, h is a positive integer constant that the user fixes upon initializing the 
monitor. Due to the nondeterminism, a bug may have multiple such error 
traces, but the monitor needs to report only one of them. 

(b) 
initialize(→1) 
add_child(→1, 1→1) 
add_child(→1, 1→2) 
deactivate(→1) 
add_child(1→2, 2→3) 
history(2→3) 
deactivate(1→2) 

Fig. 2. Illustration of a monitor run of the automaton from Figure 1b on the stream cab. 
Part (a) shows the monitor’s traversal of the automaton with some instrumentation. 
Part (b) shows the sequence of tree buffer operations that the monitor invokes. Part (c) 
shows the tree-buffer data structure that the monitor builds. 


2. The monitor processes each event in a constant amount of time, thus paving 
the way for implementing real-time runtime verifiers that track error traces. 
(There is a need for real-time verifiers [21].) Not only is the time constant, 
but little space is wasted as well. Wasted space occurs if the monitor keeps 
transitions that are not among the h most recent relevant transitions. 

Due to the nondeterminism of the automaton, those constraints force the 
monitor to keep track of a tree of computation histories. For properties that 
can be monitored with slicing [22] the tree of computation histories has a very 
particular shape. That shape allows for a relatively straightforward technique for 
providing error traces, using linear buffers. However, it has been shown that some 
interesting program properties, including taint properties, cannot be expressed 
by slicing [1,9]. 

In this paper we provide a monitor for general nondeterministic automata, at 
the same time satisfying the properties 1 and 2 mentioned above. The single most 
crucial step is the design of an efficient data structure, which we call tree buffer. 
A tree buffer operates on general trees and may be of independent interest. 

Tree Buffers for Monitoring. A tree buffer is a data structure that 
stores parts of a tree. Its two main operations are add_child(x, y), which 
adds to the tree a new node y as a child of node x, and history(x), which 
requests the h ancestors of x, where h is a constant positive integer. For 
memory efficiency the tree buffer distinguishes between active and inactive 
nodes. When add_child(x, y) or history(x) is called, node x must be active. 
In the case of add_child(x, y), the new node y becomes active. There is 
also a deactivate(x) operation, which makes x inactive. One of the main 
contributions of this paper is the design of efficient algorithms that provide 
the functionality of tree buffers with asymptotically optimal time and space 
complexity. More precisely, the add_child and deactivate operations take 
constant time, and the space wasted by nodes that are no longer accessible via 
history calls is bounded by a constant times the space occupied by nodes that 
are accessible via history calls. 

In the following, we give an example of how an efficient monitor operates, 
assuming that an efficient tree buffer is available. Consider the automaton from 
Figure 1b and the stream cab. The monitor keeps pairs of (1) a current automaton 
state q, and of (2) a tree buffer node with the most recent relevant transition of 
a run that led to q. Initially, this pair is $(1, {\to}1)$, as 1 is the initial state of the 
automaton (see Figure 2). 

Upon reading c, the automaton takes the transition $1 \xrightarrow{c} 1$, and the monitor 
simulates the automaton by evolving from $(1, {\to}1)$ to a new pair $(1, {\to}1)$: 
the first component remains unchanged because $1 \xrightarrow{c} 1$ is a loop; the second 
component remains unchanged because $1 \xrightarrow{c} 1$ is irrelevant. 

Next, a is read. The automaton takes transitions $1 \xrightarrow{a} 1$ and $1 \xrightarrow{a} 2$, both 
relevant. Corresponding to the automaton transition $1 \xrightarrow{a} 1$, the monitor evolves 
$(1, {\to}1)$ into a new pair $(1, 1 \xrightarrow{a} 1)$: the first component remains unchanged 
because $1 \xrightarrow{a} 1$ is a loop; the second component changes because $1 \xrightarrow{a} 1$ is 
relevant. Corresponding to the automaton transition $1 \xrightarrow{a} 2$, the monitor also 
evolves $(1, {\to}1)$ into a new pair $(2, 1 \xrightarrow{a} 2)$. Now that two relevant transitions 
were taken, they are added to the tree buffer: both $1 \xrightarrow{a} 1$ and $1 \xrightarrow{a} 2$ are 
children of ${\to}1$. Moreover, because ${\to}1$ is no longer in any pair kept by the 
monitor, it is deactivated in the tree buffer. 

Next, b is read. The automaton takes transitions $1 \xrightarrow{b} 1$, $2 \xrightarrow{b} 1$, and $2 \xrightarrow{b} 3$. 
Out of the two transitions with the same target the monitor will pick only one to 
simulate, using an application-specific heuristic. In Figure 2, the monitor chose to 
ignore $2 \xrightarrow{b} 1$. Moreover, because $1 \xrightarrow{a} 2$ used to be in the monitor’s pairs before 
b was read but no longer is, its corresponding tree buffer node is deactivated. 
Finally, since state 3 is accepting, the monitor will ask the tree buffer for an error 
trace, by calling history($2 \xrightarrow{b} 3$). 

In Figure 7 we provide pseudocode formalizing the sketched algorithm. 

2 Tree Buffers 

Consider a procedure that handles a stream of events. At any point in time the 
procedure should be able to output the previous h events in the stream, where 
h is a fixed constant. Such linear buffers are ubiquitous in computer science, 
with applications, for example, in instruction pipelines [24], voice-over-network 
protocols [11], and distributed operating systems [14]. Linear buffers can be easily 
implemented using circular buffers, using $O(h)$ memory and constant update 
time, which is clearly optimal. 
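
As a point of comparison for what follows, a linear buffer can be sketched in a few lines of Python. This is only an illustration: the class name `LinearBuffer` and its methods are hypothetical, and the standard-library `collections.deque` with `maxlen` stands in for a hand-written circular buffer.

```python
from collections import deque


class LinearBuffer:
    """Keeps the last h events of a stream in O(h) memory (illustrative sketch)."""

    def __init__(self, h):
        # A bounded deque behaves like a circular buffer of capacity h.
        self.events = deque(maxlen=h)

    def push(self, event):
        # Constant time; the oldest event is dropped once the buffer is full.
        self.events.append(event)

    def history(self):
        # The stored events, oldest first (at most h of them).
        return list(self.events)


buf = LinearBuffer(h=3)
for event in "cabbcab":
    buf.push(event)
recent = buf.history()
print(recent)
```

The buffer forgets everything except the last h events, which is exactly the behaviour that breaks down once events refer back to older events.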
initialize(x) 
    parent(x) := nil 
    children(x) := 0 
    Nodes := {x} 
    Active := {x} 
    mem := 1 
    memOld := 1 

add_child(x, y) 
    assert that x ∈ Active and y ∉ Nodes 
    parent(y) := x 
    children(x) := children(x) + 1 
    Nodes := Nodes ∪ {y} 
    Active := Active ∪ {y} 

deactivate(x) 
    Active := Active − {x} 

history(x) 
    assert that x ∈ Active 
    xs := [] 
    repeat h times, or until x = nil 
        xs := x · xs 
        x := parent(x) 
    return xs 

expand(x, {y1, ..., yn}) 
    for i ∈ {1, ..., n} 
        add_child(x, yi) 
    deactivate(x) 


Fig. 3. The naive algorithm. 
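
The naive algorithm translates almost line by line into Python. The following is a minimal sketch rather than the authors' implementation: the class name `NaiveTreeBuffer` is hypothetical, nodes are assumed to be hashable values, the dictionary `parent` doubles as the set Nodes, and the bookkeeping fields children, mem and memOld are omitted because they do not affect the naive algorithm's behavior.

```python
class NaiveTreeBuffer:
    """Sketch of the naive algorithm: stores the whole tree, never frees nodes."""

    def __init__(self, h, root):
        self.h = h
        self.parent = {root: None}   # keys of this dict play the role of Nodes
        self.active = {root}

    def add_child(self, x, y):
        assert x in self.active and y not in self.parent
        self.parent[y] = x
        self.active.add(y)

    def deactivate(self, x):
        self.active.discard(x)

    def expand(self, x, ys):
        for y in ys:
            self.add_child(x, y)
        self.deactivate(x)

    def history(self, x):
        assert x in self.active
        xs = []
        for _ in range(self.h):      # repeat h times, or until x is nil
            if x is None:
                break
            xs.insert(0, x)
            x = self.parent[x]
        return xs


# The operation sequence of Figure 2(b), with nodes named by strings and h = 2.
tb = NaiveTreeBuffer(2, "->1")
tb.add_child("->1", "1->1")
tb.add_child("->1", "1->2")
tb.deactivate("->1")
tb.add_child("1->2", "2->3")
trace = tb.history("2->3")
tb.deactivate("1->2")
print(trace)
```

Note that all four nodes stay in memory after the run, even though only some of them can still be reached by a history call.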


While this buffering approach is simple and efficient, it is less appropriate if 
the streamed data is organized hierarchically. Consider a stream of events, each 
of which contains a link to one of the previous events. We already saw an example 
of how such streams arise in runtime verification (Figure 2). But, there are many 
other situations where such streams could arise; for example, when trees such 
as XML data are transmitted over a network, or when recording the spawned 
processes of a parallel computation, or when recording Internet browsing history. 

A natural requirement for a buffer is to store the most recent data. For a 
tree this could mean, for example, the leaves of the tree, or the h ancestors of 
each leaf, where h is a constant. Observe that a linear buffer does not satisfy 
such requirements, because an old leaf or the parent of a new leaf may have been 
streamed much earlier, so that they have been removed from the buffer already. 

A tree buffer is a tree-like data structure that satisfies such requirements. It 
supports the following operations: 

- initialize(x) initializes the tree with the single node x and makes x active 

- add_child(x, y) adds node y as a child of the active node x and makes y 
active 

- deactivate(x) makes x inactive 

- expand(x, {y1, ..., yn}) adds nodes y1, ..., yn as children of the active node x, 
makes x inactive, and makes y1, ..., yn active 

- history(x) requests the h ancestors of the active node x, where h is a 
constant positive integer 

A simple use case of a tree buffer consists of an initialize operation, followed by 
expand operations with n > 0. In this case the active nodes are always exactly 
the leaves. 

The functionality of tree buffers is defined by the naive algorithm shown 
in Figure 3. The notation f(x), where f is a field name such as parent or children, 
stands for the field f of the node x, while the notation F(x), where F is a procedure 
name such as history, stands for a call to function F with argument x. The field children 
and the variables mem and memOld do not affect the behavior of the naive 
algorithm: they are used later. The assertions at the beginning of add_child 
and history detect sequences of operations that are invalid. For example, any 
sequence that does not start with a call to initialize is invalid. For such invalid 
sequences, tree buffer implementations are not required to behave like the naive 
algorithm. For valid sequences we require implementations to be functionally 
equivalent, although their performance may differ. 

The naive algorithm is time optimal: initialize, add_child, and deactivate 
all take constant time; and history takes $O(h)$ time. However, it is not space 
efficient, as it does not take advantage of deactivate operations: it does not 
delete nodes that are out of reach of history. The challenge in designing tree 
buffers lies in preserving both time and space efficiency. On the one hand, it 
is not space efficient to store the whole tree. On the other hand, it is not time 
efficient to exactly identify the nodes that must be stored. 

3 Space Efficient Algorithms 

The naive algorithm is time efficient but not space efficient. This section presents 
several other algorithms. First, if each deactivate is followed by garbage 
collection, then the implementation becomes space efficient but not time efficient. 
Second, if deactivate is followed by garbage collection only at certain times, 
then the implementation becomes both space and time efficient, but only in an 
amortized sense. Third, we present an algorithm that is both space and time 
efficient in a strict sense. The last algorithm is somewhat sophisticated, and 
its correctness requires a non-obvious proof. The implementation of all four 
algorithms, which fully specifies all the details, is available online [10]. 


3.1 The Garbage Collecting Algorithm 

A space optimal implementation uses no more memory than needed to answer 
history queries. To make this precise, let us define the height of a node x to 
be the shortest distance from x to an active node in the subtree of x, were we 
to use the naive algorithm. Active nodes have height 0. A node with no active 
node in its subtree has height $\infty$. Let $H_i$ be the set of nodes with height i, and 
let $H_{<i}$ be the set of nodes with height less than i. 

The memory needed to answer history queries is $\Omega(|H_{<h}|)$, and the gc 
algorithm of Figure 4 achieves this bound. On line 5 of gc, the list Level 
represents $H_{i-1}$ and Seen represents $H_{<i}$. Thus, on line 13, the list Level 
represents $H_{h-1}$ and Seen represents $H_{<h}$. The procedure delete_parent 
implements a reference counting scheme. 

Let us consider a sequence of add_child and deactivate operations, coming 
after initialize. We call add_child and deactivate modifying operations. Let 
$H_i^{(k)}$ be the $H_i$ corresponding to the tree obtained after k modifying operations, 
and let $s_{gc}^{(k)}$ be the space used by the gc algorithm after k modifying operations. 
gc() 
 1  Seen := copy_of(Active) 
 2  Level := convert_to_list(Active) 
 3  i := 1 
 4  while i < h and Level is nonempty 
 5      NextLevel := [] 
 6      for y ∈ Level 
 7          x := parent(y) 
 8          if x ∉ Seen 
 9              Seen := {x} ∪ Seen 
10              NextLevel := x · NextLevel 
11      Level := NextLevel 
12      i := i + 1 
13  for y ∈ Level 
14      delete_parent(y) 

delete_parent(y) 
 1  x := parent(y) 
 2  if x ≠ nil 
 3      children(x) := children(x) − 1 
 4      if children(x) = 0 
 5          delete_parent(x) 
 6          delete x 
 7          mem := mem − 1 
 8      parent(y) := nil 

add_child(x, y) 
 1  assert that x ∈ Active 
 2  parent(y) := x 
 3  children(x) := children(x) + 1 
 4  Active := Active ∪ {y} 
 5  mem := mem + 1 

deactivate(x) 
 1  Active := Active − {x} 
 2  gc() 

Fig. 4. The gc algorithm. The tree buffer operations initialize, expand, and history 
are those defined in Figure 3. 
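
To make the level-by-level traversal concrete, here is a hedged Python sketch in the spirit of Figure 4. It is not a transcription: the class name `GcTreeBuffer` and the helpers `_release` and `_cut_parent` are hypothetical, the recursion of delete_parent is replaced by a loop, and three guards are my own additions (nil parents are skipped during the traversal, active nodes are never deleted, and a deactivated leaf is released immediately).

```python
class GcTreeBuffer:
    """Sketch: garbage-collect after every deactivate (space-frugal, slow)."""

    def __init__(self, h, root):
        self.h = h
        self.parent = {root: None}
        self.children = {root: 0}     # reference counts
        self.active = {root}

    def add_child(self, x, y):
        assert x in self.active and y not in self.parent
        self.parent[y] = x
        self.children[y] = 0
        self.children[x] += 1
        self.active.add(y)

    def deactivate(self, x):
        self.active.discard(x)
        self._release(x)              # assumption: free a dead leaf right away
        self.gc()

    def history(self, x):
        assert x in self.active
        xs = []
        while x is not None and len(xs) < self.h:
            xs.insert(0, x)
            x = self.parent[x]
        return xs

    def gc(self):
        seen = set(self.active)       # nodes of height < i
        level = list(self.active)     # nodes of height i - 1
        i = 1
        while i < self.h and level:
            next_level = []
            for y in level:
                x = self.parent[y]
                if x is not None and x not in seen:
                    seen.add(x)
                    next_level.append(x)
            level = next_level
            i += 1
        for y in level:               # height h - 1: the parent link is not needed
            self._cut_parent(y)

    def _cut_parent(self, y):
        p = self.parent[y]
        if p is not None:
            self.parent[y] = None
            self.children[p] -= 1
            self._release(p)

    def _release(self, x):
        # Delete x, then every ancestor that is left inactive and childless.
        while x is not None and self.children[x] == 0 and x not in self.active:
            p = self.parent.pop(x)
            del self.children[x]
            if p is not None:
                self.children[p] -= 1
            x = p


# A chain 0 - 1 - 2 - 3 with h = 2: only the last two nodes must survive.
gc_tb = GcTreeBuffer(2, 0)
for i in (1, 2, 3):
    gc_tb.add_child(i - 1, i)
for i in (0, 1, 2):
    gc_tb.deactivate(i)
print(sorted(gc_tb.parent), gc_tb.history(3))
```

Every call to deactivate walks up from all active nodes, which is what makes this variant expensive when many nodes are active.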


Proposition 1. Consider the gc algorithm from Figure 4. The memory used 
after k modifying operations is optimal: $s_{gc}^{(k)} \in \Theta(|H_{<h}^{(k)}|)$. The runtime used to 
process k modifying operations is $O(k^2)$. 

The space bound is obvious. For the time bound, the following sequence exhibits 
the quadratic behavior: initialize(0), add_child(0, 1), add_child(0, 2), 
deactivate(2), add_child(0, 3), add_child(0, 4), deactivate(4), ... 

3.2 The Amortized Algorithm 

Our aim is to mitigate or even solve the time problem of the gc algorithm, but 
to retain space optimality up to a constant. One idea is to invoke the garbage 
collector rarely, so that the time spent in garbage collection is amortized. To 
this end, we call gc when the number of nodes in memory has doubled since 
the end of the last garbage collection. We obtain the amortized algorithm from 
Figure 5. It is here that the counters mem and memOld are finally used. 

add_child(x, y) 
    assert that x ∈ Active 
    parent(y) := x 
    children(x) := children(x) + 1 
    Active := Active ∪ {y} 
    mem := mem + 1 
    if mem = 2 · memOld 
        gc() 
        memOld := mem 

Fig. 5. The amortized algorithm. The tree buffer operations initialize, deactivate, 
expand, history are those defined in Figure 3. The subroutine gc is that defined in 
Figure 4. 

The following theorem states that the amortized algorithm is space efficient, 
by comparing it with the gc algorithm, which is space optimal. As before, let us 
consider a sequence of modifying operations. We write $s_{amo}^{(k)}$ for the space used 
by the amortized implementation after the first k operations. Call a sequence 
of operations extensive if every deactivate(x) is immediately preceded by an 
add_child(x, y) for some y. For example, a sequence is extensive if it consists of 
an initialize operation followed by expand operations with n > 0. 

Theorem 2. Consider the amortized algorithm in Figure 5. A sequence of $\ell$ 
modifying operations takes $O(\ell)$ time. We have $s_{amo}^{(k)} \in O(\max_{j \le k} s_{gc}^{(j)})$ for all 
$k \le \ell$. If the sequence is extensive then $s_{amo}^{(k)} \in O(s_{gc}^{(k)})$ for all $k \le \ell$. 

Loosely speaking, the theorem says that the space wasted in-between two garbage 
collections is bounded by the space that would be needed by the space optimal 
implementation at some earlier time, up to a constant. It also says that the time 
used is optimal for a sequence of operations. 
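
The doubling rule can be sketched in Python as follows. This is an illustration under stated assumptions, not the implementation of Figure 5: the class name `AmortizedTreeBuffer` is hypothetical, and the collector is a simple mark-and-sweep variant (mark all nodes of height less than h, then drop the rest) instead of the reference-counting gc of Figure 4. Its cost is linear in the number of stored nodes, which is what the amortization argument needs.

```python
class AmortizedTreeBuffer:
    """Sketch: collect garbage only when the node count has doubled."""

    def __init__(self, h, root):
        self.h = h
        self.parent = {root: None}
        self.active = {root}
        self.mem_old = 1              # nodes in memory after the last collection

    def add_child(self, x, y):
        assert x in self.active and y not in self.parent
        self.parent[y] = x
        self.active.add(y)
        if len(self.parent) >= 2 * self.mem_old:
            self.gc()
            self.mem_old = len(self.parent)

    def deactivate(self, x):
        self.active.discard(x)        # no collection here

    def history(self, x):
        assert x in self.active
        xs = []
        while x is not None and len(xs) < self.h:
            xs.insert(0, x)
            x = self.parent[x]
        return xs

    def gc(self):
        # Mark: walk up h - 1 levels from the active nodes.
        seen = set(self.active)
        level = list(self.active)
        for _ in range(self.h - 1):
            next_level = []
            for y in level:
                x = self.parent[y]
                if x is not None and x not in seen:
                    seen.add(x)
                    next_level.append(x)
            level = next_level
        # Sweep: keep marked nodes only, cutting links to dropped parents.
        self.parent = {y: (p if p in seen else None)
                       for y, p in self.parent.items() if y in seen}


# Use the tree buffer as a linear buffer: a chain of 100 nodes with h = 3.
amo = AmortizedTreeBuffer(3, 0)
peak = 1
for i in range(1, 101):
    amo.add_child(i - 1, i)
    amo.deactivate(i - 1)
    peak = max(peak, len(amo.parent))
print(peak, amo.history(100))
```

In this run the number of stored nodes stays within a small constant factor of h, while individual add_child calls occasionally pay for a full collection.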


3.3 The Real-Time Algorithm 

In general, interactive applications should not have amortized implementations. 
Interactive applications include graphical user interfaces, but also real-time 
systems and runtime verification monitors for real-time systems. More generally 
speaking, the environment, be it human or machine, does not accumulate patience 
as the time goes by. Thus, time bounds that apply to each operation are preferable 
to bounds that apply to the sequence of operations performed so far. 

The difficulty of designing a real-time algorithm stems from the fact that 
whether a node is needed depends on its height, but the heights cannot be 
maintained efficiently. This is because one deactivate operation may change 
the heights of many nodes, possibly far away. 

The key idea is to under-approximate the set of unneeded nodes; that is, to 
find a property that is easily computable, and only unneeded nodes have it. To 
do so, we maintain three other quantities instead of heights. The depth of a node 
is its distance to the root via parent pointers, were we to use the naive algorithm. 
The representative of a node is its closest ancestor whose depth is a multiple 
of h. The active count of a node is the number of active nodes that have it as 
a representative. Unlike height, these three quantities — depth, representative, 
active count — are easy to maintain explicitly in the data structure. The depth 
only needs to be computed when the node is added to the tree. The representative 
of a node is either itself or the same as the representative of its parent, depending 
on whether the depth is a multiple of h. Finally, when a node is deactivated 
(added to the tree, respectively), only one active count changes: the active count 
of the node’s representative is decreased (increased, respectively) by one. 

The active count of a representative becomes 0 only if its height is at least h, 
which means it is unneeded to answer subsequent history queries. Thus, the 
set of nodes that are representatives and have an active count of 0 constitutes 
an under-approximation of the set of unneeded nodes. The resulting real-time 
algorithm appears in Figure 6. 
initialize(x) 
    Active := {x} 
    parent(x) := nil 
    children(x) := 0 
    depth(x) := 0 
    rep(x) := x 
    cnt(x) := 1 

process_queue() 
    if queue is nonempty 
        x := deque() 
        cut_parent(x) 
        delete x 

deactivate(x) 
    Active := Active − {x} 
    cnt(rep(x)) := cnt(rep(x)) − 1 
    if children(x) = 0 
        enque(x) 
    if cnt(rep(x)) = 0 
        cut_parent(rep(x)) 
    process_queue() 

add_child(x, y) 
    assert that x ∈ Active 
    assert that cnt(y) = children(y) = 0 
    Active := Active ∪ {y} 
    parent(y) := x 
    children(x) := children(x) + 1 
    depth(y) := depth(x) + 1 
    if depth(y) ≡ 0 (mod h) 
        rep(y) := y 
    else 
        rep(y) := rep(x) 
    cnt(rep(y)) := cnt(rep(y)) + 1 
    process_queue() 

cut_parent(y) 
    x := parent(y) 
    if x ≠ nil 
        children(x) := children(x) − 1 
        if children(x) = 0 and x ∉ Active 
            enque(x) 
        parent(y) := nil 


Fig. 6. The real-time algorithm. The tree buffer operations expand and history are 
those defined in Figure 3. The enque and deque operations are the standard operations 
of a queue data structure. 
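
The pseudocode of Figure 6 can be rendered in Python roughly as follows. This is a sketch for experimentation only: the class name `RealTimeTreeBuffer` is hypothetical, per-node fields are stored in dictionaries, and "delete x" is modelled by removing x from those dictionaries. It assumes a valid operation sequence (in particular, no node is deactivated twice).

```python
from collections import deque


class RealTimeTreeBuffer:
    """Sketch of the real-time algorithm: O(1) work per modifying operation."""

    def __init__(self, h, root):
        self.h = h
        self.active = {root}
        self.parent = {root: None}
        self.children = {root: 0}
        self.depth = {root: 0}
        self.rep = {root: root}       # closest ancestor at a depth divisible by h
        self.cnt = {root: 1}          # active nodes represented by this node
        self.queue = deque()          # nodes scheduled for deletion

    def add_child(self, x, y):
        assert x in self.active and y not in self.parent
        self.active.add(y)
        self.parent[y] = x
        self.children[y] = 0
        self.cnt[y] = 0
        self.children[x] += 1
        self.depth[y] = self.depth[x] + 1
        self.rep[y] = y if self.depth[y] % self.h == 0 else self.rep[x]
        self.cnt[self.rep[y]] += 1
        self._process_queue()

    def deactivate(self, x):
        self.active.discard(x)
        r = self.rep[x]
        self.cnt[r] -= 1
        if self.children[x] == 0:
            self.queue.append(x)
        if self.cnt[r] == 0:
            self._cut_parent(r)
        self._process_queue()

    def history(self, x):
        assert x in self.active
        xs = []
        while x is not None and len(xs) < self.h:
            xs.insert(0, x)
            x = self.parent[x]
        return xs

    def _process_queue(self):
        if self.queue:                # delete at most one node per operation
            x = self.queue.popleft()
            self._cut_parent(x)
            for field in (self.parent, self.children, self.depth, self.rep, self.cnt):
                del field[x]

    def _cut_parent(self, y):
        x = self.parent[y]
        if x is not None:
            self.children[x] -= 1
            if self.children[x] == 0 and x not in self.active:
                self.queue.append(x)
            self.parent[y] = None


# Same linear-buffer workload as before: a chain of 100 nodes with h = 3.
rt = RealTimeTreeBuffer(3, 0)
rt_peak = 1
for i in range(1, 101):
    rt.add_child(i - 1, i)
    rt.deactivate(i - 1)
    rt_peak = max(rt_peak, len(rt.parent))
print(rt_peak, rt.history(100))
```

Each modifying operation performs a bounded number of dictionary and queue operations, so no single call has to pay for a full collection.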


As delete_parent did in the gc algorithm, the function deactivate implements 
a reference counting scheme, using children as the counter. Unlike the 
gc algorithm, the node is not deleted immediately, but scheduled for deletion, 
by being placed in a queue. This queue is processed whenever the user calls 
add_child or deactivate. When the queue is processed, by process_queue, 
one node is deleted from memory, and perhaps its parent is scheduled for deletion. 

The proof of the following theorem, provided in Appendix B.2, is subtle. 
As before, we write $s_{rt}^{(k)}$ for the space that the real-time algorithm has 
allocated and not deleted after k operations. 

Theorem 3. Consider the real-time algorithm from Figure 6, and a sequence 
of $\ell$ modifying operations. Every operation takes $O(1)$ time. We have 
$s_{rt}^{(k)} \in O(\max_{j \le k} s_{gc}^{(j)})$ for all $k \le \ell$. If the sequence is extensive then 
$s_{rt}^{(k)} \in O(s_{gc}^{(k)})$ for all $k \le \ell$. 

4 Monitoring 

Consider a nondeterministic automaton $A = (Q, \Sigma, q_0, F, \delta_i, \delta_r)$, where Q is a set 
of states, $\Sigma$ is the alphabet of events, $q_0 \in Q$ is the initial state, $F \subseteq Q$ contains 
the accepting states, and $\delta_i, \delta_r \subseteq Q \times \Sigma \times Q$ are, respectively, the irrelevant 
and the relevant transitions. We aim to construct a monitor that reads a stream 
of events and reports an error trace when an accepting state has been reached. 
Since A is in general nondeterministic and there are both irrelevant and relevant 
transitions, building an efficient monitor for A is not straightforward. We have 
sketched in the introduction how to use a tree buffer for such a monitor. The 
algorithm in Figure 7 makes this precise. 

The main invariants (line 4) are the following: 

- If the pair (q, node) is in the list now, then history(node) would return the 
last $\le h$ relevant transitions of some computation $q_0 \xrightarrow{w} q$ of A, where w is 
the stream read so far. 

- If there is a computation $q_0 \xrightarrow{w} q$ of A, then, after reading w, a pair (q, node) 
is in the list now, for some node. 

A node x is created and added to the tree buffer when a relevant transition is 
taken (lines 10-11). The node x is deactivated (line 19) when and only when it 
is about to be removed from the list now (line 20), since neither add_child(x, ·) 
nor history(x) can be invoked later. 

In the following subsections we give two applications for this monitor. The 
location, which accompanies events (lines 5 and 10), is application dependent. 
For regular expression searching, the location is an index in a string; for runtime 
verification, the location is a position in the program text. 
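
The monitor can be prototyped in Python as a generator over a finite list of events. The sketch below rests on several assumptions of mine: the function name `monitor` and its parameters are hypothetical, transitions are given as dictionaries from (state, event) to lists of target states, the tree buffer is the naive one (so nothing is ever freed), and the example automaton is my reading of Figure 1b from the walkthrough in the introduction, with the transition from 2 to 1 on b treated as irrelevant.

```python
def monitor(events, q0, accepting, delta_i, delta_r, h):
    """Yield one error trace each time an accepting state is reached (sketch)."""
    root = (("start", q0), None)      # node = (transition, location)
    parent = {root: None}             # naive tree buffer: node -> parent node
    active = {root}

    def history(node):
        xs = []
        while node is not None and len(xs) < h:
            xs.insert(0, node)
            node = parent[node]
        return xs

    now = [(q0, root)]
    for location, a in enumerate(events):
        nxt, nxt_states, nxt_nodes = [], set(), set()
        for q, par in now:
            for relevant, delta in ((False, delta_i), (True, delta_r)):
                for q2 in delta.get((q, a), ()):
                    if q2 in nxt_states:
                        continue      # keep only one run per target state
                    if relevant:
                        child = ((q, a, q2), location)
                        parent[child] = par       # add_child(par, child)
                        active.add(child)
                    else:
                        child = par
                    nxt.append((q2, child))
                    nxt_states.add(q2)
                    nxt_nodes.add(child)
                    if q2 in accepting:
                        yield history(child)
        for _, node in now:
            if node not in nxt_nodes:
                active.discard(node)  # deactivate(node)
        now = nxt


# Assumed encoding of the automaton of Figure 1b; state 3 is accepting.
delta_i = {(1, "c"): [1], (1, "b"): [1], (2, "b"): [1]}
delta_r = {(1, "a"): [1, 2], (2, "b"): [3]}
traces = list(monitor("cab", 1, {3}, delta_i, delta_r, h=2))
print(traces)
```

Here the location is simply the index of the event in the stream. Replacing the dictionary `parent` by one of the space-efficient tree buffers would bound the memory used between reports.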


4.1 Regular-Expression Searching 

We show that regular-expression searching with capturing groups can be implemented 
by constructing an automaton with irrelevant and relevant transitions, 
and then running the monitor from Figure 7. Suppose we want to search 
Wikipedia for famous people with reduplicated names, like ‘Ford Madox Ford’. 
One approach is to use the following (Python) regular expression: 

Ford( [A-Z][a-z]*){m,n} Ford (1) 

This expression matches names starting and ending with ‘Ford’, and with at 
least m and at most n middle names in-between. The parentheses indicate so-called 
capturing groups: The regular-expression engine is asked to remember (and 
possibly later output) the position in the text where the group was matched. 
We can implement this as follows. First, we compile the regular expression with 
capturing groups into an automaton with relevant and irrelevant transitions. 
Which transitions are relevant could be determined automatically using the 
capturing groups, or the user could specify it using a special-purpose extension 
of the syntax of regular expressions. Whenever the automaton takes a relevant 
transition, the position in the text should be remembered. Then we run the 
monitor from Figure 7 on this automaton. In this way we can output the last h 
matches of capturing groups. In contrast, standard regular-expression engines 
would report only the last occurrence of each match. In the example expression (1), 
they would report only the last of Ford’s middle names. One would have to unroll 
the expression n times in order to make a standard engine report them all. 

monitor() 
 1  root_node := make_node(→q0, nil) 
 2  initialize(root_node) 
 3  now, nxt := [(q0, root_node)], [] 
 4  forever 
 5      a, location := get_next_event_and_location() 
 6      for each (q, parent) in the list now 
 7          for each a-labeled transition t = (q → q') ∈ δi ⊎ δr 
 8              if ¬in_nxt(q') 
 9                  if t ∈ δr 
10                      child := make_node(t, location) 
11                      add_child(parent, child) 
12                  if t ∈ δi 
13                      child := parent 
14                  append (q', child) to nxt 
15                  in_nxt(q'), in_nxt(child) := true, true 
16                  if q' ∈ F 
17                      report_error(history(child)) 
18      for each (q, node) in the list now 
19          if ¬in_nxt(node) then deactivate(node) 
20      now, nxt := nxt, [] 
21      for each (q, node) in the list now 
22          in_nxt(q), in_nxt(node) := false, false 

Fig. 7. A monitor for the automaton A = (Q, Σ, q0, F, δi, δr). The monitor reports 
error traces by using a tree buffer. 


For the regular expression (1), we remark that any equivalent deterministic 
automaton has $\Omega(2^m)$ states, so nondeterminism is essential for feasibility.¹ 
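
The behaviour of a standard engine on a repeated capturing group is easy to check with Python's `re` module. In this small illustration the input string is made up, and the bounds m = 1 and n = 3 are arbitrary choices.

```python
import re

text = "Ford Madox Hueffer Ford"   # made-up input with two middle names

# Expression (1) with m = 1 and n = 3: the group matches twice,
# but only its last repetition is retained by the engine.
match = re.search(r"Ford( [A-Z][a-z]*){1,3} Ford", text)
print(match.group(0))
print(match.group(1))

# Unrolling the repetition by hand gives one group per middle name.
unrolled = re.search(r"Ford( [A-Z][a-z]*)( [A-Z][a-z]*)? Ford", text)
print(unrolled.groups())
```

The first search keeps only the second middle name in group 1, whereas the unrolled expression exposes both, at the cost of one group per repetition.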


4.2 Runtime Verification 

For runtime verification we use the monitor from Figure 7 as well, in the way 
we sketched in the introduction. Clearly, for real-time runtime verification the 
real-time tree buffer algorithm needs to be used. 

We have not yet emphasized one feature of our monitor, which is essential for 
runtime verification: The automaton $A = (Q, \Sigma, q_0, F, \delta_i, \delta_r)$ may have an infinite 
set Q of states, and it may deal with infinite event alphabets $\Sigma$. Note that we did 
not require any finiteness of the automaton for our monitor. We can implement the 
monitor from Figure 7, as long as we have a finite description of A, which allows 
us to loop over transitions (line 7) and to store individual states and events. One 
can view this as constructing the (infinite) automaton on the fly. For instance, the 
event alphabet could be $U \times \mathit{Value}$, where U = {iter, hasNext, next, other} 
and Value is the set of all program values, which includes integers, booleans, 
object references, and so on. There are various works on automata over infinite 
alphabets and with infinitely many states. In those works, infinite (-state or 
-alphabet) automata are usually called configuration graphs, whereas the word 
automaton refers to a finite description of a configuration graph. In contrast to 
the rest of the paper, we use that terminology in the rest of this paragraph. Often 
there exists an explicitly defined translation of an automaton to a configuration 
graph (for example, for register automata [15], class memory automata [3], and 
history register automata [25]). Even when the semantics are not given in terms 
of a configuration graph, it is often easy to devise a natural translation. For 
example, the configuration graph in Figure 8 is obtained from the automaton of 
Figure 1a using an obvious translation that would also apply in the case of data 
automata [6] and in the case of slicing [22]. 

$I(k) = \{(\mathit{iter}, k)\}$ 
$H(k) = \{(\mathit{hasNext}, k)\}$ 
$N(k) = \{(\mathit{next}, k)\}$ 
$O(k) = \{(\mathit{other}, k)\}$ 
$\Sigma = \bigcup_{k \in \mathit{Value}} (I(k) \cup H(k) \cup N(k) \cup O(k))$ 
$X(k) = \Sigma - H(k) - N(k)$ 
$Y(k) = \Sigma - N(k)$ 

Fig. 8. The configuration graph of Figure 1a. The arcs are labeled by sets of events, 
meaning that there is one transition for each event in the set. The picture shows only 
three values from Value = {1, 2, 3, ...} 

¹ We use a large value for m when we want to find people with reduplicated names that 
are long. By searching Wikipedia with large values for m we found, for example, ‘Jose 
Marfa del Carmen Francisco Manuel Joaquin Pedro Juan Andres Avelino Cayetano 
Venancio Francisco de Paula Gonzaga Javier Ramon Bias Tadeo Vicente Sebastian 
Rafael Melchior Caspar Baltasar Luis Pedro de Alcantara Buenaventura Diego Andres 
Apostol Isidro’ (a Spanish don). 

5 Experiments 

This section complements the asymptotic results of Section 3 with experimental 
results from three data sets. The implementation, datasets, and experimental 
logs are available online [10]. 


5.1 Datasets 

1. The first dataset is a sequence of n = 10^ operations that simulate a sequence 
of linear buffer operations. That is, we called the tree buffer as follows: 
initialize(0); expand(0, {1}); ... ; expand(n − 1, {n}). 
2. We produced (manually) the automaton in Figure 9 from the regular expression 
‘.*a( *[^ ]){8} *a’, and ran the monitor from Section 4 on the text 
of Wikipedia. This dataset contains 7 • 10^ tree buffer operations. 



Fig. 9. A nondeterministic automaton without a small, deterministic equivalent: It 
finds substrings that contain 10 non-space characters, the first and last of which are ‘a’. 
The structure of the automaton is similar to the one corresponding to the regular 
expression from Section 4.1. 


3. We ran the monitor from Section 4 on infinite automata alongside the 
DaCapo test suite. The property we monitored was specified using a TOPL 
automaton [9], and it was essentially the one in Figure 1a: it is an error 
if there is a next without a preceding hasNext that returned true. We 
used the projects avrora (simulator of a grid of microcontrollers), eclipse 
(development environment), fop (XSL to PDF converter), h2 (in-memory 
database), luindex (text indexer), lusearch (text search engine), pmd (simple 
code analyzer), sunflow (ray tracer), tomcat (servlet server), and xalan (XML 
to HTML converter) from version 9.12 of the DaCapo test suite [4]. This 
dataset contains 8 • 10^ tree buffer operations. 
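
The first dataset can be replayed with a minimal sketch such as the one below. It assumes a naive tree buffer that never deletes nodes, and a history query that returns the queried node followed by its ancestors, at most h nodes in total. The names NaiveTreeBuffer and simulate_linear_buffer are illustrative and are not taken from the implementation in [10].

```python
class NaiveTreeBuffer:
    """Naive tree buffer: keeps every node forever (assumed behaviour)."""

    def __init__(self, root, h):
        self.h = h                    # history length returned by queries
        self.parent = {root: None}    # node -> parent (None for the root)
        self.active = {root}          # nodes that may still be expanded

    def expand(self, x, children):
        """Attach the given fresh nodes as children of x, then deactivate x."""
        assert x in self.active, "only active nodes can be expanded"
        for y in children:
            assert y not in self.parent, "children must be fresh nodes"
            self.parent[y] = x
            self.active.add(y)
        self.active.discard(x)

    def history(self, x):
        """Return x followed by its ancestors, at most h nodes in total."""
        out = []
        while x is not None and len(out) < self.h:
            out.append(x)
            x = self.parent[x]
        return out


def simulate_linear_buffer(n, h):
    """Replay initialize(0); expand(0, {1}); ...; expand(n - 1, {n})."""
    buf = NaiveTreeBuffer(0, h)
    for i in range(n):
        buf.expand(i, [i + 1])
    return buf
```

With this call pattern every node has exactly one child, so the tree degenerates into a list and a history query behaves like reading the last h entries of a linear buffer.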


5.2 Empirical Results 

We measure space and time in a way that is machine independent. For space, 
there is a natural measure: the number of nodes in memory. For time, it is 
less clear what the best measure is: We follow Knuth [17], and count memory 
references. 


Runtime versus History. Figure 10 gives the average number of memory references 
per operation. We observe that this number does not depend on h, except for 
very small values of h, thus validating the asymptotic results about time from 
Section 3. Figure 13 in Section A confirms that the gc algorithm is much slower 
than the others. 


Runtime Variability. Figure 11 shows that for the amortized and gc algorithms 
there exist operations that take a long time. In contrast, the plots for the naive 
and the real-time algorithms are almost invisible because they are completely 
concentrated on the left side of Figure 11. 
Fig. 10. The average number of memory references per tree buffer operation. 

Fig. 11. Histogram for the number of memory references per operation, for h = 100. 





Fig. 12. How much space is necessary: (a) as linear buffers; (b) regular expression searching; (c) runtime verification. 

Memory versus History. In Figure 12, we notice that the memory usage of the 
amortized and the real-time algorithms is within a factor of 2 of the memory usage 
of the gc algorithm, thus validating the asymptotic results about space from 
Section 3. The naive algorithm is excluded from Figure 12 because its memory 
usage is much bigger than that of the other algorithms. 


6 Conclusions, Related Work, and Future Work 

We have designed tree buffers, a data structure that generalizes linear buffers. 
A tree buffer consumes a stream of events each of which declares its parent to 
be one of the preceding events. Tree buffers can answer queries that ask for the 
h ancestors of a given event. Implementing tree buffers with good performance 
is not easy. We have explored the design space by developing four possible 
algorithms (naive, gc, amortized, real-time). Two of those are straightforward: 
naive is time optimal, and gc is space optimal. The other two algorithms are 
time and space optimal at the same time: amortized is simpler but not suitable 
for real-time use, and real-time is more involved but suitable for real-time use. 
Proving the amortized and the real-time algorithms correct requires some care. We 
have validated our algorithms on data sets from three different application areas. 

Algorithms that process their input in a gradual manner have been studied 
under the names of online algorithms, dynamic data structures, and, more recently, 
streaming algorithms. These algorithms address different problems than tree 
buffers. For example, streaming algorithms [7,19] fall into two classes: those 
that process numeric streams, and those that process graph streams. Graph 
streaming algorithms are concerned with problems such as: ‘Are vertices u and v 
connected in the graph described so far?’ One of the basic tools used for answering 
such questions are link-cut trees [23]. Yet, like all the existing graph streaming 
algorithms, link-cut trees do not give more weight to the recent parts of the tree, 
in the way tree buffers do. Such a preference for recent data has been studied 
only in the context of numeric streams. For example, the following problem has 
been studied: ‘Which movie is most popular currently?’ [19, Section 4.7]. 

The closest relatives of tree buffers remain the simple and ubiquitous linear 
buffers. Since tree buffers extend linear buffers naturally, it is easy to imagine a 
wide array of applications. We have discussed an engine for regular expression 
searching as one example. The main motivation of our research is to enhance 
runtime verification monitors with the ability to provide error traces, fulfilling 
real-time constraints if needed, and covering general nondeterministic automata 
specifications. We have described this application in detail. 

Several automata models that are used in runtime verification, including 
the TOPL automata used in our implementation, are nondeterministic [9,12,22], 
which led us to a tree data structure that can track such automata. Some 
automata models are even more general, such as quantified event automata [1] and 
alternating automata [8]. The construction of error-trace providing monitors for 
such automata is an intriguing challenge that seems to raise further fundamental 
algorithmic questions. 

Fig. 13. The average number of memory references per tree buffer operation. Unlike 
Figure 10, these plots include the gc algorithm. 


B Proofs 

All results talk about sequences of modifying operations, but this is without 
loss of generality: (1) any call to HISTORY takes O(1) space and O(h) time in all 
algorithms; (2) any call to EXPAND(x, {y_1, ..., y_n}) is equivalent to the segment 
of operations 

add_child(x, y_1); ... ; add_child(x, y_n); deactivate(x) 

Given these observations, we can use the results from below to deduce the space 
and time usage of any sequence of operations. 
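
Observation (2) can be illustrated with a small sketch. It assumes that the tree is stored as a parent map together with a set of active nodes; this representation and the function signatures are hypothetical and chosen only for illustration.

```python
def add_child(parent, active, x, y):
    """Attach the fresh node y under the active node x; y becomes active."""
    assert x in active and y not in parent
    parent[y] = x
    active.add(y)


def deactivate(active, x):
    """Mark x as no longer expandable."""
    active.remove(x)


def expand(parent, active, x, children):
    """expand(x, {y_1..y_n}) as add_child(x, y_1); ...; add_child(x, y_n); deactivate(x).

    Returns the number of modifying operations that were performed.
    """
    for y in children:
        add_child(parent, active, x, y)
    deactivate(active, x)
    return len(children) + 1
```

Under this reading, one call to expand with n children counts as n + 1 modifying operations, which is why bounds stated for sequences of modifying operations carry over.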

The following lemma about extensive sequences will be used in the proofs of 
Theorems 2 and 3. 

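Lemma 4. Consider an extensive sequence of $\ell$ operations. Let $n \ge 1$. Then for 
all $i, j$ with $0 \le i \le j \le \ell$ we have $|H^{(i)}_{<n}| - 1 \le |H^{(j)}_{<n}|$. 

Proof. We first establish these two facts: 

$|H^{(i)}_{<n}| - 1 \le |H^{(i+1)}_{<n}|$   for $0 \le i < \ell$   (2) 

$|H^{(i)}_{<n}| \le |H^{(i+2)}_{<n}|$   for $0 \le i < \ell - 1$   (3) 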

For (2), we do a case analysis on the $(i+1)$th operation. The interesting case 
is that in which the $(i+1)$th operation is a deactivate(x), for some $x$. Because 
the sequence is extensive, the $i$th operation must be add_child(x, y), for some $y$. 
Consider now an arbitrary node $z \in H^{(i)}_{<n}$. By the definition of $H^{(i)}_{<n}$, there must 
exist an active node $u$ such that $z = \mathit{parent}^{k}(u)$, for some $k < n$. If $u \ne x$, then 
$u$ remains active after the deactivate(x) operation, and hence $z \in H^{(i+1)}_{<n}$. If 
$u = x$, then $z = \mathit{parent}^{k+1}(y)$. In this case, if $k + 1 < n$, then again $z \in H^{(i+1)}_{<n}$. 
Thus, there is at most one element of $H^{(i)}_{<n}$ that might not belong to $H^{(i+1)}_{<n}$, 
namely $\mathit{parent}^{n-1}(x)$. We proved (2). 

For (3), note that in an extensive sequence at most one of the $(i+1)$th 
and $(i+2)$th modifying operations is a deactivate. Given (2) and given that 
add_child increases by 1 the number of active nodes, (3) follows. 

Now, take $i$ and $j$ such that $i \le j$. By repeated application of (3) we know 
that $|H^{(i)}_{<n}| \le |H^{(i+2p)}_{<n}|$ for all $p \ge 0$ such that $i + 2p \le \ell$. In particular, either 
$|H^{(i)}_{<n}| \le |H^{(j)}_{<n}|$ or $|H^{(i)}_{<n}| \le |H^{(j-1)}_{<n}|$. In the first case we are done; in the second 
case we find the desired result by using (2). □ 
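
Lemma 4 can be checked empirically on random inputs. The sketch below assumes, following the proof above, that a sequence is extensive when every deactivate(x) immediately follows an add_child(x, y), and that $H_{<n}$ contains the nodes reachable from an active node by fewer than n parent steps. The function names are illustrative.

```python
import random


def h_less_than(parent, active, n):
    """H_{<n}: nodes reachable from an active node by fewer than n parent steps."""
    seen = set()
    for u in active:
        z, k = u, 0
        while z is not None and k < n:
            seen.add(z)
            z, k = parent[z], k + 1
    return seen


def random_extensive_sizes(length, n, seed):
    """Sizes |H^{(i)}_{<n}| for i = 0..length along a random extensive sequence."""
    rng = random.Random(seed)
    parent, active = {0: None}, {0}
    sizes = [len(h_less_than(parent, active, n))]
    pending = None  # node x of the latest add_child(x, y), if not yet followed up
    while len(sizes) <= length:
        if pending is not None and rng.random() < 0.5:
            active.remove(pending)        # deactivate(x) right after add_child(x, y)
            pending = None
        else:
            x = rng.choice(sorted(active))
            y = len(parent)               # fresh node identifier
            parent[y] = x
            active.add(y)
            pending = x
        sizes.append(len(h_less_than(parent, active, n)))
    return sizes


def lemma4_holds(sizes):
    """Check sizes[i] - 1 <= sizes[j] for all i <= j, using a running maximum."""
    best = 0
    for s in sizes:
        best = max(best, s)
        if best - 1 > s:
            return False
    return True
```

A sequence that is not extensive can violate the bound: after add_child(0, 1); add_child(0, 2); deactivate(1); deactivate(2) with n = 1, the sizes are 1, 2, 3, 2, 1.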

B.l Proof of Theorem 2 

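Theorem 2. Consider the amortized algorithm in Figure 5. A sequence of $\ell$ 
modifying operations takes $O(\ell)$ time. We have $s^{(k)}_{\mathrm{amo}} \in O(\max_{j \le k} s^{(j)}_{\mathrm{gc}})$ for all 
$k \le \ell$. If the sequence is extensive then $s^{(k)}_{\mathrm{amo}} \in O(s^{(k)}_{\mathrm{gc}})$ for all $k \le \ell$. 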

A garbage collection cycle is a segment $a$ of some sequence of modifying 
operations such that 

— the first operation of $a$ follows immediately after an operation that triggered 
a garbage collection, or after initialize; and 

— the operations of $a$ do not trigger a garbage collection, except possibly the 
last operation. 

We begin by proving the following lemma. 

Lemma 5. There exists a constant $c$ such that the runtime of any garbage 
collection cycle $a$ is at most $c \cdot k$, where $k$ is the length of $a$. 

Proof. Recall the implementation from Figure 5. Each modifying operation that 
does not trigger the garbage collector takes at most $c_1$ time, for some constant $c_1$. 
Thus, if $a$ does not trigger the garbage collector then its runtime is at most $c_1 \cdot k$. 
It remains to check the case in which the last operation of a does trigger the 
garbage collector. 

The time spent in the garbage collector is at most $c_2 \cdot \mathit{mem}$, for some constant $c_2$. 
In order to find an upper bound for $\mathit{mem}$, we make two observations: 

— when the garbage collector is triggered, $\mathit{mem} = 2 \cdot \mathit{memOld}$, and 

— the number $\mathit{mem} - \mathit{memOld}$ of nodes added to the tree is the number of 
add_child operations in $a$, which in turn is at most $k$. 

Combining these two observations we get that $\mathit{mem} \le 2 \cdot k$. 

We can now compute a bound for the total runtime of $a$: 

$c_1 \cdot k + c_2 \cdot \mathit{mem} \le c_1 \cdot k + c_2 \cdot (2 \cdot k) = (c_1 + 2 c_2) \cdot k$ 

Thus, $c := c_1 + 2 c_2$ has the required property. □ 

Now we prove Theorem 2. 

Proof (of Theorem 2). Consider any sequence $a$ of $\ell$ modifying operations. First 
we prove the statement on time complexity. The sequence $a$ can be decomposed 
into garbage collection cycles. Applying Lemma 5 to each garbage collection 
cycle, and summing up the runtimes, we obtain that $a$ takes at most $c \cdot \ell$ time. 
This is $O(\ell)$ time. 

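Next we prove the statements on space complexity. Pick an arbitrary $k \le \ell$. 
Let $k_0 \ge 0$ be the largest number so that $k_0 \le k$ and either $k_0 = 0$ or the $k_0$th 
operation triggered a garbage collection. For any $i \ge 0$ write $\mathit{mem}^{(i)}$ for the value 
of $\mathit{mem}$ after the $i$th operation. The garbage collection ensures $\mathit{mem}^{(k_0)} = |H^{(k_0)}_{<h}|$. 
Further, the implementation of add_child ensures $\mathit{mem}^{(k)} \le 2 \cdot \mathit{mem}^{(k_0)}$, and 
so $\mathit{mem}^{(k)} \le 2 \cdot |H^{(k_0)}_{<h}|$. For all $i$ we have $s^{(i)}_{\mathrm{amo}} \in \Theta(\mathit{mem}^{(i)})$ and $s^{(i)}_{\mathrm{gc}} \in \Theta(|H^{(i)}_{<h}|)$. 

It follows $s^{(k)}_{\mathrm{amo}} \in O(s^{(k_0)}_{\mathrm{gc}})$ and hence $s^{(k)}_{\mathrm{amo}} \in O(\max_{j \le k} s^{(j)}_{\mathrm{gc}})$, which is the first of 
the two statements on space complexity. For the second one, assume that $a$ is 
extensive. By Lemma 4 we have $|H^{(k)}_{<h}| \ge |H^{(k_0)}_{<h}| - 1$, so 

$\mathit{mem}^{(k)} \le 2 \cdot (|H^{(k)}_{<h}| + 1)$, 

and hence $s^{(k)}_{\mathrm{amo}} \in O(s^{(k)}_{\mathrm{gc}})$. □ 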
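
The space argument can be exercised with a minimal sketch of an amortized scheme. This is not the code of Figure 5: it assumes only what the proof states, namely that a collection is triggered when mem reaches 2 · memOld and that a collection keeps exactly the nodes of $H_{<h}$. The class name, the parent-map representation, and the history query are assumptions made for illustration.

```python
class AmortizedTreeBuffer:
    """Sketch of an amortized tree buffer with history length h."""

    def __init__(self, root, h):
        self.h = h
        self.parent = {root: None}   # nodes currently in memory
        self.active = {root}
        self.mem_old = 1             # memory size right after the last collection

    def live_nodes(self):
        """H_{<h}: nodes fewer than h parent steps above some active node."""
        live = set()
        for u in self.active:
            z, k = u, 0
            while z is not None and k < self.h:
                live.add(z)
                z, k = self.parent[z], k + 1
        return live

    def _collect(self):
        """Drop every node outside H_{<h} and cut pointers into dropped nodes."""
        live = self.live_nodes()
        self.parent = {z: (p if p in live else None)
                       for z, p in self.parent.items() if z in live}
        self.mem_old = max(1, len(self.parent))

    def add_child(self, x, y):
        assert x in self.active and y not in self.parent
        self.parent[y] = x
        self.active.add(y)
        if len(self.parent) >= 2 * self.mem_old:   # mem = 2 * memOld
            self._collect()

    def deactivate(self, x):
        self.active.remove(x)

    def history(self, x):
        """Return x and its ancestors, at most h nodes; x is assumed active."""
        out = []
        while x is not None and len(out) < self.h:
            out.append(x)
            x = self.parent[x]
        return out
```

On an extensive sequence, the number of entries in the parent map should stay within 2 · (|H_{<h}| + 1) after every operation, which is the inequality derived at the end of the proof.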

B.2 Proof of Theorem 3 

In the following, consider the tree obtained in the reference implementation after 
a fixed sequence of modifying operations. By Nodes we denote the set of nodes 
of the tree. The following lemma states a monotonicity property of $|H_i|$: 

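Lemma 6. We have $|H_i| \ge |H_{i+1}|$ for all $i \ge 0$. As a consequence, we have 

$|H_{<2h}| \le 2\,|H_{<h}|$. 

Proof. Denote by $\mathit{parent} : \mathit{Nodes} \to \mathit{Nodes}$ the partial function that assigns 
to a node its parent; $\mathit{parent}(x)$ is undefined for the root $x$. Extend $\mathit{parent}$ to 
$\mathit{parent} : 2^{\mathit{Nodes}} \to 2^{\mathit{Nodes}}$ in the standard way. Then we have $H_{i+1} \subseteq \mathit{parent}(H_i)$ 
and $|H_i| \ge |\mathit{parent}(H_i)|$. The statement follows. □ 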
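
One way to spell out the last step for the bound on $H_{<2h}$ is sketched below. It assumes the notation $H_i = \{\mathit{parent}^i(u) : u \text{ active}\}$ and $H_{<n} = \bigcup_{i<n} H_i$, which matches how these sets are used in the proofs of this appendix, and it uses that the image of a finite set under a partial function is no larger than the set itself.

```latex
% Assumed notation: H_i = { parent^i(u) : u active },  H_{<n} = \bigcup_{i<n} H_i.
\begin{align*}
H_{<2h} &= H_{<h} \cup \bigcup_{i<h} H_{h+i}, \\
\bigcup_{i<h} H_{h+i} &\subseteq \bigcup_{i<h} \mathit{parent}^{h}(H_i)
                       = \mathit{parent}^{h}(H_{<h}), \\
|H_{<2h}| &\le |H_{<h}| + |\mathit{parent}^{h}(H_{<h})| \le 2\,|H_{<h}|.
\end{align*}
```

The inclusion in the second line follows by applying the inclusion from the proof h times.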

Let the level of node $x$, denoted by $\mathit{level}(x)$, be $\lfloor \mathit{depth}(x)/h \rfloor$. A node $x$ 
is called recent if there exists an active node $y$ in the subtree of $x$ such that 
$\mathit{level}(x) \ge \mathit{level}(y) - 1$. Let $R$ denote the set of recent nodes. 

Lemma 7. We have $R \subseteq H_{<2h}$. 

Proof. We pick an arbitrary $x \in R$ and show that $x \in H_{<2h}$. 

Because $x$ is recent, there exist an active node $y$ and an integer $k \ge 0$ such 
that $\mathit{level}(x) \ge \mathit{level}(y) - 1$ and $x = \mathit{parent}^{k}(y)$. Thus, 

$$\left\lfloor \frac{\mathit{depth}(x)}{h} \right\rfloor \ge \left\lfloor \frac{\mathit{depth}(y)}{h} \right\rfloor - 1 = \left\lfloor \frac{\mathit{depth}(x) + k - h}{h} \right\rfloor .$$

In general, if $\lfloor a/h \rfloor \ge \lfloor b/h \rfloor$ then $b - a < h$. In our case, $k - h < h$, so $k < 2h$. In 
other words, if $y$ is a witness for $x \in R$, then $y$ is also a witness for $x \in H_{<2h}$. □ 

A node $x$ is said to be a fringe node when $\mathit{depth}(x) \equiv 0 \pmod{h}$ and 
$\mathit{cnt}(x) = 0$. A node $x$ is said to be a doomed node when it is inactive and each 
of its children is either a fringe node or a doomed node. Let $D$ denote the set 
of doomed nodes. It is easy to check that the real-time algorithm schedules for 
deletion (and then deletes) only doomed nodes. 

Lemma 8. Every node is either doomed or recent: $\mathit{Nodes} = R \uplus D$. 

Proof. We prove first that a node that is not doomed must be recent; we will 
later prove that a recent node must be not doomed. 

Let $x$ be a node that is not doomed. If there exists an active node $y$ in the 
subtree of $x$ such that $\mathit{level}(x) = \mathit{level}(y)$, then $x$ is recent. Thus, for what follows, 
assume that no such node $y$ exists. In this case, we will prove by induction on 
$k := h - (\mathit{depth}(x) \bmod h)$ that there exists an active node $z$ in the subtree of $x$ such 
that $\mathit{level}(x) = \mathit{level}(z) - 1$, and hence $x$ is, again, recent. Note that $1 \le k \le h$. 

The base case is $k = 1$. By the definition of doomed, $x$ is active, or it has 
a child $u$ that is not doomed and not fringe. If $x$ were active, then we could 
take $y := x$; so $x$ must be inactive. Because $k = 1$, it must be that $\mathit{depth}(u) \equiv 0 \pmod{h}$. 
Since $u$ is not fringe, it must be that $\mathit{cnt}(u) > 0$. Hence, there exists an 
active node $z$ and an integer $0 \le l < h$ such that $u = \mathit{parent}^{l}(z)$. We have that 
$\mathit{level}(x) = \mathit{level}(u) - 1 = \mathit{level}(z) - 1$, and so $z$ has the desired properties. 

For the induction step, pick an arbitrary $k$ such that $1 < k \le h$. As 
above, $x$ must be inactive, and must have a child $u$ that is not doomed and 
not fringe. In addition, $\mathit{level}(x) = \mathit{level}(u)$, because of the limits on $k$. By the 
induction hypothesis, there exists an active node $z$ in the subtree of $u$ such 
that $\mathit{level}(u) = \mathit{level}(z) - 1$. This node $z$ is also in the subtree of $x$, and indeed 
$\mathit{level}(x) = \mathit{level}(z) - 1$. 

We conclude that if a node is not doomed then it is recent. 

For the other direction, let $x$ be a recent node. By the definition of recent, 
there exists an active node $y$ in the subtree of $x$ such that $\mathit{level}(x) \ge \mathit{level}(y) - 1$. 
Let $k$ be an integer such that $x = \mathit{parent}^{k}(y)$, and consider the path from $y$ 
to $x$, excluding $x$: $\mathit{parent}^{0}(y), \mathit{parent}^{1}(y), \ldots, \mathit{parent}^{k-1}(y)$. None of these nodes 
is a fringe node: A fringe node would have to be in a different level than the 
active node $y$, but that would force $\mathit{level}(x) < \mathit{level}(y) - 1$. We can thus prove by 
induction that all these nodes are not doomed: $\mathit{parent}^{0}(y)$ is not doomed because 
it is active, and $\mathit{parent}^{l+1}(y)$ is not doomed because $\mathit{parent}^{l}(y)$ is not doomed 
and not fringe for $0 \le l < k$. In fact, the induction from above also established 
that $x$ is not doomed. 

We conclude that if a node is recent then it is not doomed. □ 
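
The partition claimed by Lemma 8 can be checked on random trees with the sketch below. It assumes that cnt(x) counts the active nodes z with $x = \mathit{parent}^{l}(z)$ for some $0 \le l < h$, which is how cnt is used in the proof above; the actual counter of the real-time algorithm may be maintained differently. Function names are illustrative.

```python
import random


def random_tree(size, rng):
    """parent[i] for nodes 0..size-1; node 0 is the root, parents precede children."""
    return [None] + [rng.randrange(i) for i in range(1, size)]


def classify(parent, active, h):
    """Return (recent, doomed) as sets of node indices."""
    size = len(parent)
    depth = [0] * size
    children = [[] for _ in range(size)]
    for x in range(1, size):
        depth[x] = depth[parent[x]] + 1
        children[parent[x]].append(x)
    level = [d // h for d in depth]

    recent, cnt = set(), [0] * size
    for y in active:
        z, k = y, 0
        while z is not None:              # walk from y up to the root
            if level[z] >= level[y] - 1:
                recent.add(z)             # y witnesses that z is recent
            if k < h:
                cnt[z] += 1               # assumed meaning of cnt (see lead-in)
            z, k = parent[z], k + 1

    fringe = {x for x in range(size) if depth[x] % h == 0 and cnt[x] == 0}
    doomed = set()
    for x in reversed(range(size)):       # children have larger indices than parents
        if x not in active and all(c in fringe or c in doomed for c in children[x]):
            doomed.add(x)
    return recent, doomed
```

For every tree and every choice of active nodes, the two returned sets should be disjoint and together cover all nodes.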


In the following we consider a sequence of $\ell$ modifying operations. We write 
$R^{(k)}$ for the set of recent nodes after $k$ operations, and $M^{(k)}$ for the set of nodes 
in memory after $k$ operations, i.e., nodes that have been added but not (yet) 
deleted by the real-time algorithm. 

Lemma 9. For all $k \le \ell$: 

(a) We have $R^{(k)} \subseteq M^{(k)}$. 

(b) If $M^{(k)} \setminus R^{(k)} \ne \emptyset$, then the queue is nonempty after $k$ operations. 

Proof. For point (a), Lemma 8 together with the observation that only doomed 
nodes are scheduled for deletion suffice. For point (b), observe that the implementation 
uses a reference counting scheme that directly mirrors the definition 
of doomed nodes. □ 

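Lemma 10. We have $|M^{(k)}| \le \max_{j \le k} |H^{(j)}_{<2h}|$ for all $k \le \ell$. If the sequence is 
extensive then $|M^{(k)}| \le |H^{(k)}_{<2h}|$ for all $k \le \ell$. 

Proof. We proceed by induction on $k$. The base case ($k = 0$) is trivial. Let 
$0 < k \le \ell$. If $R^{(k)} = M^{(k)}$ then we have $M^{(k)} = R^{(k)} \subseteq H^{(k)}_{<2h}$ by Lemma 7. 
Hence $|M^{(k)}| \le |H^{(k)}_{<2h}|$, and it follows that $|M^{(k)}| \le \max_{j \le k} |H^{(j)}_{<2h}|$. 
So assume for the rest of the proof that the inclusion $R^{(k)} \subseteq M^{(k)}$ 
from Lemma 9 (a) is strict. Then, by Lemma 9 (b), the queue is not empty after 
$k$ operations. So the $k$th operation deletes from memory a node in the queue, 
and we have: 

$|M^{(k)}| = |M^{(k-1)}|$   if the $k$th operation is an add_child, 
$|M^{(k)}| = |M^{(k-1)}| - 1$   if the $k$th operation is a deactivate.   (4) 

In either case we have $|M^{(k)}| \le |M^{(k-1)}|$. By applying the induction hypothesis, 
it follows $|M^{(k)}| \le \max_{j \le k} |H^{(j)}_{<2h}|$. 

Assume for the rest of the proof that the sequence is extensive. Let the $k$th 
operation be an add_child. Then we have: 

$|M^{(k)}| \le |M^{(k-1)}| \le |H^{(k-1)}_{<2h}| \le |H^{(k)}_{<2h}|$, 

where the last inequality is because no node is deactivated in the $k$th operation. 
Let the $k$th operation be a deactivate. Then we have: 

$|M^{(k)}| \overset{(4)}{\le} |M^{(k-1)}| - 1 \overset{\text{ind. hyp.}}{\le} |H^{(k-1)}_{<2h}| - 1 \overset{\text{Lemma 4}}{\le} |H^{(k)}_{<2h}|$. 

This concludes the proof. □ 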

Now we can prove Theorem 3: 

Theorem 3. Consider the real-time algorithm from Figure 6, and a sequence 
of $\ell$ modifying operations. Every operation takes $O(1)$ time. We have $s^{(k)}_{\mathrm{rt}} \in 
O(\max_{j \le k} s^{(j)}_{\mathrm{gc}})$ for all $k \le \ell$. If the sequence is extensive then $s^{(k)}_{\mathrm{rt}} \in O(s^{(k)}_{\mathrm{gc}})$ for 
all $k \le \ell$. 

Proof. By combining Lemmas 10 and 6, $|M^{(k)}| \le 2 \max_{j \le k} |H^{(j)}_{<h}|$ for all $k \le \ell$. 
If the sequence is extensive then $|M^{(k)}| \le 2\,|H^{(k)}_{<h}|$ for all $k \le \ell$. The theorem 
follows, as $s^{(k)}_{\mathrm{gc}} \in \Theta(|H^{(k)}_{<h}|)$. □ 